High-Dimensional Text Clustering by Dimensionality Reduction and Improved Density Peak
Similar resources
Using Dimensionality Reduction Methods in Text Clustering
High dimensionality of the feature space is a major concern in text clustering because of its effect on computational complexity and accuracy. Therefore, various dimension reduction methods have been introduced in the literature to select an informative subset (or sublist) of features. As each dimension reduction method uses a different strategy (aspect) to select a subset of featu...
Dimensionality reduction for density ratio estimation in high-dimensional spaces
The ratio of two probability density functions is becoming a quantity of interest these days in the machine learning and data mining communities since it can be used for various data processing tasks such as non-stationarity adaptation, outlier detection, and feature selection. Recently, several methods have been developed for directly estimating the density ratio without going through density ...
Automatic topography of high-dimensional data sets by non-parametric Density Peak clustering
Data analysis in high-dimensional spaces aims at obtaining a synthetic description of a data set, revealing its main structure and its salient features. We here introduce an approach for charting data spaces, providing a topography of the probability distribution from which the data are harvested. This topography includes information on the number and the height of the probability peaks, the de...
Clustering Including Dimensionality Reduction
In this paper, new methodologies for clustering and dimensionality reduction of large data sets are illustrated using both a least-squares and a maximum likelihood approach. The methodologies are demonstrated through both real applications and Monte Carlo simulations.
A Systematic Study on Document Representation and Dimensionality Reduction for Text Clustering
Increasingly large text datasets and the high dimensionality associated with natural language are a great challenge in text mining. In this research, a systematic study is conducted of the application of three Dimension Reduction Techniques (DRT) to three different document representation methods in the context of the text clustering problem, using several standard benchmark datasets. The dimensional...
Journal
Journal title: Wireless Communications and Mobile Computing
Year: 2020
ISSN: 1530-8677,1530-8669
DOI: 10.1155/2020/8881112